endoscopic image
DGGAN: Degradation Guided Generative Adversarial Network for Real-time Endoscopic Video Enhancement
Xu, Handing, Nie, Zhenguo, Peng, Tairan, Pan, Huimin, Liu, Xin-Jun
Endoscopic surgery relies on intraoperative video, making image quality a decisive factor for surgical safety and efficacy. Yet, endoscopic videos are often degraded by uneven illumination, tissue scattering, occlusions, and motion blur, which obscure critical anatomical details and complicate surgical manipulation. Although deep learning-based methods have shown promise in image enhancement, most existing approaches remain too computationally demanding for real-time surgical use. To address this challenge, we propose a degradation-aware framework for endoscopic video enhancement, which enables real-time, high-quality enhancement by propagating degradation representations across frames. In our framework, degradation representations are first extracted from images using contrastive learning. We then introduce a fusion mechanism that modulates image features with these representations to guide a single-frame enhancement model, which is trained with a cycle-consistency constraint between degraded and restored images to improve robustness and generalization. Experiments demonstrate that our framework achieves a superior balance between performance and efficiency compared with several state-of-the-art methods. These results highlight the effectiveness of degradation-aware modeling for real-time endoscopic video enhancement. More broadly, they suggest that implicitly learning and propagating degradation representations offers a practical pathway to clinical application.
- Health & Medicine > Diagnostic Medicine (0.96)
- Health & Medicine > Surgery (0.93)
- Health & Medicine > Therapeutic Area > Oncology (0.93)
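The cycle-consistency constraint mentioned in the DGGAN abstract can be illustrated with a small sketch. The abstract does not give the loss formulation, so the L1 distance, the two-direction form, and the names `cycle_consistency_loss`, `toy_degrade`, and `toy_restore` are all assumptions made for illustration; the toy functions stand in for learned networks.

```python
from typing import Callable, List

Image = List[float]  # flattened grayscale image, values in [0, 1]


def l1_distance(a: Image, b: Image) -> float:
    """Mean absolute difference between two equally sized images."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(
    degraded: Image,
    clean: Image,
    restore: Callable[[Image], Image],
    degrade: Callable[[Image], Image],
) -> float:
    """Penalize round trips that do not return to their starting image.

    Assumed form: degraded -> restored -> degraded, plus
    clean -> degraded -> restored, both compared with L1 distance.
    """
    forward = l1_distance(degrade(restore(degraded)), degraded)
    backward = l1_distance(restore(degrade(clean)), clean)
    return forward + backward


# Hypothetical stand-ins for learned models: dimming and its inverse.
def toy_degrade(img: Image) -> Image:
    return [0.5 * p for p in img]


def toy_restore(img: Image) -> Image:
    return [min(1.0, 2.0 * p) for p in img]


if __name__ == "__main__":
    clean = [0.2, 0.8, 1.0]
    degraded = toy_degrade(clean)
    print(cycle_consistency_loss(degraded, clean, toy_restore, toy_degrade))
```

With a restorer that exactly inverts the degradation the loss is zero; any mismatch between the two mappings raises it.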
DeepGI: Explainable Deep Learning for Gastrointestinal Image Classification
Houmaidi, Walid, Hadadi, Mohamed, Sabiri, Youssef, Chtouki, Yousra
This paper presents a comprehensive comparative model analysis on a novel gastrointestinal medical imaging dataset comprising 4,000 endoscopic images spanning four critical disease classes: Diverticulosis, Neoplasm, Peritonitis, and Ureters. Leveraging state-of-the-art deep learning techniques, the study confronts common endoscopic challenges such as variable lighting, fluctuating camera angles, and frequent imaging artifacts. The best performing models, VGG16 and MobileNetV2, each achieved a test accuracy of 96.5%, while Xception reached 94.24%, establishing robust benchmarks and baselines for automated disease classification. In addition to strong classification performance, the approach includes explainable AI via Grad-CAM visualization, enabling identification of the image regions most influential to model predictions and enhancing clinical interpretability. Experimental results demonstrate the potential for robust, accurate, and interpretable medical image analysis even in complex real-world conditions. This work contributes original benchmarks, comparative insights, and visual explanations, advancing the landscape of gastrointestinal computer-aided diagnosis and underscoring the importance of diverse, clinically relevant datasets and model explainability in medical AI research.
- Europe (0.05)
- Africa > Middle East > Morocco (0.04)
- Health & Medicine > Therapeutic Area > Gastroenterology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
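For readers unfamiliar with the Grad-CAM visualization referenced in the DeepGI abstract, the core computation is small enough to sketch without a deep learning framework. The sketch assumes the activations of one convolutional layer and the gradients of the class score with respect to them have already been extracted; the function name `grad_cam` and the nested-list layout are illustrative choices, not the paper's code.

```python
from typing import List

Map2D = List[List[float]]


def grad_cam(activations: List[Map2D], gradients: List[Map2D]) -> Map2D:
    """Compute a Grad-CAM heatmap from one layer's feature maps.

    activations[k][i][j]: feature map k at spatial position (i, j).
    gradients[k][i][j]: gradient of the class score w.r.t. that value.
    """
    height, width = len(activations[0]), len(activations[0][0])

    # Channel weights: global average pooling of the gradients.
    weights = [
        sum(sum(row) for row in grad) / (height * width) for grad in gradients
    ]

    # Weighted sum of the feature maps.
    heatmap = [[0.0] * width for _ in range(height)]
    for weight, fmap in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += weight * fmap[i][j]

    # ReLU keeps only regions that push the class score up.
    heatmap = [[max(0.0, v) for v in row] for row in heatmap]

    # Normalize to [0, 1] for display when the map is not all zeros.
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[v / peak for v in row] for row in heatmap]
    return heatmap
```

In practice the heatmap is upsampled to the input resolution and overlaid on the endoscopic image.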
Advancing Minimally Invasive Precision Surgery in Open Cavities with Robotic Flexible Endoscopy
Mattille, Michelle, Mesot, Alexandre, Weisskopf, Miriam, Ochsenbein-Kölble, Nicole, Moehrlen, Ueli, Nelson, Bradley J., Boehler, Quentin
Flexible robots hold great promise for enhancing minimally invasive surgery (MIS) by providing superior dexterity, precise control, and safe tissue interaction. Yet, translating these advantages into endoscopic interventions within open cavities remains challenging. The lack of anatomical constraints and the inherent flexibility of such devices complicate their control, while the limited field of view of endoscopes restricts situational awareness. We present a robotic platform designed to overcome these challenges and demonstrate its potential in fetoscopic laser coagulation, a complex MIS procedure typically performed only by highly experienced surgeons. Our system combines a magnetically actuated flexible endoscope with teleoperated and semi-autonomous navigation capabilities for performing targeted laser ablations. To enhance surgical awareness, the platform reconstructs real-time mosaics of the endoscopic scene, providing an extended and continuous visual context. The ability of this system to address the key limitations of MIS in open spaces is validated in vivo in an ovine model.
- North America > United States (0.68)
- Europe > Switzerland > Zürich > Zürich (0.15)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area (1.00)
- Health & Medicine > Surgery (1.00)
- Health & Medicine > Health Care Technology (1.00)
Lesion-Aware Visual-Language Fusion for Automated Image Captioning of Ulcerative Colitis Endoscopic Examinations
Escamilla, Alexis Ivan Lopez, Ochoa, Gilberto, Al, Sharib
We present a lesion-aware image captioning framework for ulcerative colitis (UC). The model integrates ResNet embeddings, Grad-CAM heatmaps, and CBAM-enhanced attention with a T5 decoder. Clinical metadata (MES score 0-3, vascular pattern, bleeding, erythema, friability, ulceration) is injected as natural-language prompts to guide caption generation. The system produces structured, interpretable descriptions aligned with clinical practice and provides MES classification and lesion tags. Compared with baselines, our approach improves caption quality and MES classification accuracy, supporting reliable endoscopic reporting.
- North America > Mexico (0.05)
- Europe > United Kingdom > England > West Yorkshire > Leeds (0.04)
- Europe > France (0.04)
- Health & Medicine > Therapeutic Area > Gastroenterology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (0.98)
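The idea of injecting clinical metadata as a natural-language prompt, described in the ulcerative colitis captioning abstract, might look roughly like the sketch below. The field list follows the abstract (MES score 0-3, vascular pattern, bleeding, erythema, friability, ulceration), but the prompt wording, the class `UCFindings`, the function `build_caption_prompt`, and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UCFindings:
    """Clinical metadata fields named in the abstract."""

    mes: int  # Mayo Endoscopic Score, 0 to 3
    vascular_pattern: str
    bleeding: str
    erythema: str
    friability: str
    ulceration: str


def build_caption_prompt(findings: UCFindings) -> str:
    """Render the metadata as a text prompt for a text decoder.

    The template is an assumption; the abstract does not give one.
    """
    if findings.mes not in (0, 1, 2, 3):
        raise ValueError("MES score must be an integer from 0 to 3")
    return (
        "Describe the endoscopic image. "
        f"MES score: {findings.mes}. "
        f"Vascular pattern: {findings.vascular_pattern}. "
        f"Bleeding: {findings.bleeding}. "
        f"Erythema: {findings.erythema}. "
        f"Friability: {findings.friability}. "
        f"Ulceration: {findings.ulceration}."
    )


if __name__ == "__main__":
    example = UCFindings(2, "obliterated", "none", "marked", "present", "absent")
    print(build_caption_prompt(example))
```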
Vision Transformers for Kidney Stone Image Classification: A Comparative Study with CNNs
Reyes-Amezcua, Ivan, Lopez-Tiro, Francisco, Larose, Clement, Mendez-Vazquez, Andres, Ochoa-Ruiz, Gilberto, Daul, Christian
Kidney stone classification from endoscopic images is critical for personalized treatment and recurrence prevention. While convolutional neural networks (CNNs) have shown promise in this task, their limited ability to capture long-range dependencies can hinder performance under variable imaging conditions. This study presents a comparative analysis between Vision Transformers (ViTs) and CNN-based models, evaluating their performance on two ex vivo datasets comprising CCD camera and flexible ureteroscope images. The ViT-base model pretrained on ImageNet-21k consistently outperformed a ResNet50 baseline across multiple imaging conditions. For instance, in the most visually complex subset (Section patches from endoscopic images), the ViT model achieved 95.2% accuracy and 95.1% F1-score, compared to 64.5% and 59.3% with ResNet50. In the mixed-view subset from CCD-camera images, ViT reached 87.1% accuracy versus 78.4% with CNN. These improvements extend across precision and recall as well. The results demonstrate that ViT-based architectures provide superior classification performance and offer a scalable alternative to conventional CNNs for kidney stone image analysis.
- Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.04)
- North America > Mexico > Jalisco > Guadalajara (0.04)
- Health & Medicine > Therapeutic Area > Urology (1.00)
- Health & Medicine > Therapeutic Area > Nephrology (0.89)
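The accuracy, precision, recall, and F1 figures quoted in the kidney stone abstract can be reproduced from predictions with a few lines of code. The sketch below uses macro averaging over classes, which is an assumption: the abstract does not state which averaging scheme was used. The function name `classification_metrics` and the labels are illustrative.

```python
from typing import Dict, Hashable, Sequence


def classification_metrics(
    y_true: Sequence[Hashable], y_pred: Sequence[Hashable]
) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall, and F1."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred), key=str)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        # Undefined ratios (no predictions or no samples) count as 0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```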
Enhanced Multi-Class Classification of Gastrointestinal Endoscopic Images with Interpretable Deep Learning Model
Kamble, Astitva, Bandodkar, Vani, Dharmadhikary, Saakshi, Anand, Veena, Sanki, Pradyut Kumar, Wu, Mei X., Jana, Biswabandhu
Endoscopy serves as an essential procedure for evaluating the gastrointestinal (GI) tract and plays a pivotal role in identifying GI-related disorders. Recent advancements in deep learning have demonstrated substantial progress in detecting abnormalities through intricate models and data augmentation methods. This research introduces a novel approach to enhance classification accuracy using 8,000 labeled endoscopic images from the Kvasir dataset, categorized into eight distinct classes. Leveraging EfficientNetB3 as the backbone, the proposed architecture eliminates reliance on data augmentation while preserving moderate model complexity. The model achieves a test accuracy of 94.25%, alongside precision and recall of 94.29% and 94.24% respectively. Furthermore, Local Interpretable Model-agnostic Explanation (LIME) saliency maps are employed to enhance interpretability by defining critical regions in the images that influenced model predictions. Overall, this work highlights the importance of AI in advancing medical imaging by combining high classification accuracy with interpretability.
- Asia > India (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Health & Medicine > Therapeutic Area > Gastroenterology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Enhanced Scale-aware Depth Estimation for Monocular Endoscopic Scenes with Geometric Modeling
Wei, Ruofeng, Li, Bin, Chen, Kai, Ma, Yiyao, Liu, Yunhui, Dou, Qi
Scale-aware monocular depth estimation poses a significant challenge in computer-aided endoscopic navigation. However, existing depth estimation methods that do not consider geometric priors struggle to learn the absolute scale from training with monocular endoscopic sequences. Additionally, conventional methods face difficulties in accurately estimating details on tissue and instrument boundaries. In this paper, we tackle these problems by proposing a novel enhanced scale-aware framework that only uses monocular images with geometric modeling for depth estimation. Specifically, we first propose a multi-resolution depth fusion strategy to enhance the quality of monocular depth estimation. To recover the precise scale between relative depth and real-world values, we further calculate the 3D poses of instruments in the endoscopic scenes by algebraic geometry based on the image-only geometric primitives (i.e., boundaries and tip of instruments). Afterwards, the 3D poses of surgical instruments enable the scale recovery of relative depth maps. By coupling scale factors and relative depth estimation, the scale-aware depth of the monocular endoscopic scenes can be estimated. We evaluate the pipeline on in-house endoscopic surgery videos and simulated data. The results demonstrate that our method can learn the absolute scale with geometric modeling and accurately estimate scale-aware depth for monocular scenes.
- Health & Medicine > Health Care Technology (1.00)
- Health & Medicine > Surgery (0.90)
- Health & Medicine > Diagnostic Medicine > Imaging (0.48)
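The scale-recovery step in the depth estimation abstract, coupling a scale factor with a relative depth map, can be illustrated as follows. The abstract derives metric information from instrument 3D poses but does not say how the scale factor is computed from it; the median-of-ratios estimator and the names `estimate_scale` and `apply_scale` are assumptions chosen for robustness to outliers.

```python
from statistics import median
from typing import List, Sequence


def estimate_scale(
    relative_depths: Sequence[float], metric_depths: Sequence[float]
) -> float:
    """Estimate one global scale from paired depth samples.

    relative_depths: network output sampled at instrument pixels.
    metric_depths: depths of the same points from the instrument pose.
    Uses the median ratio (an assumed estimator, robust to outliers).
    """
    ratios = [m / r for r, m in zip(relative_depths, metric_depths) if r > 0]
    if not ratios:
        raise ValueError("need at least one sample with positive relative depth")
    return median(ratios)


def apply_scale(depth_map: List[List[float]], scale: float) -> List[List[float]]:
    """Convert a relative depth map to metric units."""
    return [[scale * d for d in row] for row in depth_map]


if __name__ == "__main__":
    # Hypothetical samples; the third pair acts as an outlier.
    scale = estimate_scale([0.25, 0.5, 0.5], [12.5, 25.0, 30.0])
    print(apply_scale([[0.25, 0.5], [0.75, 1.0]], scale))
```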
Ulcerative Colitis Mayo Endoscopic Scoring Classification with Active Learning and Generative Data Augmentation
Çağlar, Ümit Mert, İnci, Alperen, Hanoğlu, Oğuz, Polat, Görkem, Temizel, Alptekin
Endoscopic imaging is commonly used to diagnose Ulcerative Colitis (UC) and classify its severity. Deep learning based methods have been shown to be effective in the automated analysis of these images and can potentially be used to aid medical doctors. Unleashing the full potential of these methods depends on the availability of a large number of labeled images; however, obtaining and labeling these images is quite challenging. In this paper, we propose an active learning based generative augmentation method. The method involves generating a large number of synthetic samples with a generative model trained on a small dataset of real endoscopic images. The resulting data pool is narrowed down by using active learning methods to select the most informative samples, which are then used to train a classifier. We demonstrate the effectiveness of our method through experiments on a publicly available endoscopic image dataset. The results show that using synthesized samples in conjunction with active learning improves classification performance compared to using only the original labeled examples: the baseline Quadratic Weighted Kappa (QWK) score of 68.1% increases to 74.5%. We also observe that attaining equivalent performance using only real data required three times as many images.
- North America > United States (0.04)
- Asia > Middle East > Republic of Türkiye > Ankara Province > Ankara (0.04)
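The Quadratic Weighted Kappa (QWK) score reported in the Mayo scoring abstract is a standard agreement measure for ordinal labels. A minimal sketch of the usual definition follows, with classes encoded as integers 0 to K-1; the function name is illustrative, and returning 1.0 when all labels fall in one shared class is a convention chosen here. The abstract quotes QWK as a percentage, so 68.1% corresponds to 0.681 on this scale.

```python
from typing import Sequence


def quadratic_weighted_kappa(
    y_true: Sequence[int], y_pred: Sequence[int], num_classes: int
) -> float:
    """QWK = 1 - sum(w * O) / sum(w * E), with w_ij = (i - j)^2 / (K - 1)^2."""
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    k = num_classes

    # Observed confusion matrix O[i][j]: true class i, predicted class j.
    observed = [[0] * k for _ in range(k)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    true_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    numerator = 0.0
    denominator = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            # Expected count under independence of the two label sets.
            expected = true_hist[i] * pred_hist[j] / n
            numerator += weight * observed[i][j]
            denominator += weight * expected

    if denominator == 0.0:
        # Both label sets sit in the same single class.
        return 1.0
    return 1.0 - numerator / denominator
```

Because the weights grow with the squared distance between grades, predicting Mayo 3 for a Mayo 0 image is penalized nine times as heavily as predicting Mayo 1.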
A Framework For Automated Dissection Along Tissue Boundary
Oh, Ki-Hwan, Borgioli, Leonardo, Zefran, Milos, Chen, Liaohai, Giulianotti, Pier Cristoforo
Robotic surgery promises enhanced precision and adaptability over traditional surgical methods. It also offers the possibility of automating surgical interventions, resulting in reduced stress on the surgeon, better surgical outcomes, and lower costs. Cholecystectomy, the removal of the gallbladder, serves as an ideal model procedure for automation due to its distinct and well-contrasted anatomical features between the gallbladder and liver, along with standardized surgical maneuvers. Dissection is a frequently used subtask in cholecystectomy where the surgeon delivers the energy on the hook to detach the gallbladder from the liver. Hence, dissection along tissue boundaries is a good candidate for surgical automation. For the da Vinci surgical robot to perform the same procedure as a surgeon automatically, it needs to have the ability to (1) recognize and distinguish between the two different tissues (e.g. the liver and the gallbladder), (2) understand where the boundary between the two tissues is located in the 3D workspace, (3) locate the instrument tip relative to the boundary in the 3D space using visual feedback, and (4) move the instrument along the boundary. This paper presents a novel framework that addresses these challenges through AI-assisted image processing and vision-based robot control. We also present the ex-vivo evaluation of the automated procedure on chicken and pork liver specimens that demonstrates the effectiveness of the proposed framework.
- North America > United States > Illinois > Cook County > Chicago (0.05)
- North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
- Europe > Spain > Galicia > Madrid (0.04)
- Health & Medicine > Surgery (1.00)
- Health & Medicine > Health Care Technology (1.00)
Improving automatic endoscopic stone recognition using a multi-view fusion approach enhanced with two-step transfer learning
Lopez-Tiro, Francisco, Villalvazo-Avila, Elias, Betancur-Rengifo, Juan Pablo, Reyes-Amezcua, Ivan, Hubert, Jacques, Ochoa-Ruiz, Gilberto, Daul, Christian
This contribution presents a deep-learning method for extracting and fusing image information acquired from different viewpoints, with the aim of producing more discriminative object features for identifying the type of kidney stones seen in endoscopic images. The model was further improved with a two-step transfer learning approach and with attention blocks that refine the learned feature maps. Deep feature fusion strategies improved the results of single-view extraction backbone models by more than 6% in terms of kidney stone classification accuracy.
- North America > United States (0.14)
- Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.04)
- South America > Peru > Cusco Department (0.04)
- North America > Mexico > Jalisco > Guadalajara (0.04)
- Health & Medicine > Therapeutic Area > Urology (0.76)
- Health & Medicine > Therapeutic Area > Nephrology (0.75)
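The multi-view fusion idea in the stone recognition abstract, combining features extracted from different viewpoints before classification, can be sketched with two common fusion operators. The abstract does not say which operators were used, so element-wise max and concatenation, the function names, and the example vectors are illustrative assumptions.

```python
from typing import List, Sequence

Vector = List[float]


def fuse_max(views: Sequence[Vector]) -> Vector:
    """Element-wise maximum across per-view feature vectors."""
    if not views or any(len(v) != len(views[0]) for v in views):
        raise ValueError("views must be non-empty and of equal length")
    return [max(values) for values in zip(*views)]


def fuse_concat(views: Sequence[Vector]) -> Vector:
    """Concatenate per-view feature vectors into one longer vector."""
    fused: Vector = []
    for view in views:
        fused.extend(view)
    return fused


if __name__ == "__main__":
    # Hypothetical features from two viewpoints of the same stone.
    view_a = [0.1, 0.9, 0.3]
    view_b = [0.7, 0.2, 0.3]
    print(fuse_max([view_a, view_b]))
    print(fuse_concat([view_a, view_b]))
```

Max fusion keeps the feature dimension fixed, while concatenation preserves every view's features at the cost of a wider classifier input.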